TensorFlow Linear Classifier

In this article, we demonstrate how to implement a TensorFlow linear classifier model through an example. Details of the dataset used in the example can be found in the Diagnostic Wisconsin Breast Cancer Database.

Train and Test sets

Data Correlations

Let's take a look at the variance of the features.
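As a minimal sketch of this step, the per-feature variance can be computed column by column. The data below is synthetic and the feature names are hypothetical stand-ins, chosen only to mimic features measured on very different scales; they are not the article's actual values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the feature matrix: three features whose
# scales differ by orders of magnitude (synthetic data, not the real set).
feature_names = ["radius", "area", "smoothness"]
X = rng.normal(loc=[14.0, 650.0, 0.1], scale=[3.5, 350.0, 0.01], size=(200, 3))

# Variance of each column (feature) across all samples.
variances = X.var(axis=0)

# Print features from largest to smallest variance.
for name, value in sorted(zip(feature_names, variances), key=lambda t: -t[1]):
    print(f"{name:>12s}: {value:.6f}")
```

A large spread between the variances is the usual motivation for rescaling the features before fitting a linear model.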

Furthermore, we would like to standardize features by removing the mean and scaling to unit variance.
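The standardization described above can be sketched with plain NumPy. The wording matches the description of scikit-learn's StandardScaler, so this sketch assumes the same convention (population standard deviation); the helper name `standardize` and the sample data are hypothetical.

```python
import numpy as np

def standardize(X):
    """Remove the per-feature mean and scale each feature to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)                 # population standard deviation (ddof=0)
    std = np.where(std == 0, 1.0, std)  # leave constant features unscaled
    return (X - mean) / std

rng = np.random.default_rng(1)
# Synthetic features on different scales (not the article's data).
X_raw = rng.normal(loc=[14.0, 650.0], scale=[3.5, 350.0], size=(100, 2))
X_scaled = standardize(X_raw)

print(X_scaled.mean(axis=0))  # close to zero for every feature
print(X_scaled.std(axis=0))   # close to one for every feature
```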

Train and Test sets

Input Function

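The input function specifies how data is converted to a tf.data.Dataset that feeds the input pipeline in a streaming fashion. More precisely, an input function returns a tf.data.Dataset object that yields two-element tuples (features, labels): features is a dictionary that maps each feature name to a tensor of that feature's values, and labels is a tensor of the corresponding label values.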
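One possible shape for such an input function is sketched below. The factory name `make_input_fn`, its parameters, and the feature names are assumptions made for illustration; only `tf.data.Dataset.from_tensor_slices`, `shuffle`, `repeat`, and `batch` are TensorFlow calls.

```python
import numpy as np
import tensorflow as tf

def make_input_fn(features, labels, batch_size=32, shuffle=True, num_epochs=None):
    """Return a zero-argument input function (hypothetical helper).

    features: dict mapping feature name -> NumPy array of values
    labels:   NumPy array of label values
    """
    def input_fn():
        # Each dataset element is a (features dict, label) tuple.
        ds = tf.data.Dataset.from_tensor_slices((dict(features), labels))
        if shuffle:
            ds = ds.shuffle(buffer_size=len(labels))
        # num_epochs=None repeats indefinitely, which suits training.
        return ds.repeat(num_epochs).batch(batch_size)
    return input_fn

# Hypothetical feature names and toy values.
toy_features = {
    "radius_mean": np.array([14.1, 20.6, 11.4, 17.9], dtype=np.float32),
    "texture_mean": np.array([19.3, 17.8, 21.2, 10.4], dtype=np.float32),
}
toy_labels = np.array([0, 1, 0, 1], dtype=np.int32)

dataset = make_input_fn(toy_features, toy_labels, batch_size=2,
                        shuffle=False, num_epochs=1)()
batch_features, batch_labels = next(iter(dataset))
print(sorted(batch_features.keys()), batch_labels.numpy())
```

Returning a closure keeps the data and batching options together, so the same factory can build separate input functions for training (shuffled, repeated) and evaluation (one pass, no shuffling).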

An estimator model works with two main parts: feature columns and a numeric input vector. Feature columns describe how the input numeric vector should be interpreted. A helper function can separate the categorical and numerical columns (features) and return a descriptive list of feature columns.
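A helper of this kind might look like the sketch below. It assumes a TensorFlow release that still provides the (now deprecated) `tf.feature_column` module; the function name `make_feature_columns`, the dict-of-arrays input, and the example feature names are hypothetical.

```python
import numpy as np
import tensorflow as tf

def make_feature_columns(features):
    """Build feature columns from a dict of name -> NumPy array (hypothetical helper)."""
    columns = []
    for name, values in features.items():
        values = np.asarray(values)
        if np.issubdtype(values.dtype, np.number):
            # Numerical feature: passed to the model as a float value.
            columns.append(tf.feature_column.numeric_column(name, dtype=tf.float32))
        else:
            # Categorical feature: described by the list of its distinct values.
            vocabulary = sorted(set(values.tolist()))
            columns.append(
                tf.feature_column.categorical_column_with_vocabulary_list(
                    name, vocabulary))
    return columns

# Hypothetical example with one numerical and one categorical feature.
example_features = {
    "radius_mean": np.array([14.1, 20.6, 11.4], dtype=np.float32),
    "site": np.array(["a", "b", "a"]),
}
feature_columns = make_feature_columns(example_features)
print([type(column).__name__ for column in feature_columns])
```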

Estimator using the default optimizer

Predictions

ROC Curves

Confusion Matrix

Estimator using the FTRL optimizer with regularization

The Follow the Regularized Leader (FTRL) optimizer implements the FTRL-Proximal online learning algorithm, applied here to binomial logistic regression (for details, see [6]).
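To make the update rule concrete, here is a minimal NumPy sketch of per-coordinate FTRL-Proximal for logistic regression with L1 and L2 regularization. It illustrates the algorithm only and is not TensorFlow's implementation; the function names, hyperparameter values, and toy data are assumptions.

```python
import numpy as np

def ftrl_weights(z, n, alpha, beta, l1, l2):
    """Closed-form weights: coordinates with |z| <= l1 are set exactly to zero."""
    shrunk = -(z - np.sign(z) * l1) / ((beta + np.sqrt(n)) / alpha + l2)
    return np.where(np.abs(z) <= l1, 0.0, shrunk)

def ftrl_fit(X, y, alpha=0.1, beta=1.0, l1=0.0, l2=0.0, epochs=5):
    """Fit logistic regression one example at a time with FTRL-Proximal."""
    z = np.zeros(X.shape[1])  # accumulated adjusted gradients
    n = np.zeros(X.shape[1])  # accumulated squared gradients
    for _ in range(epochs):
        for x_t, y_t in zip(X, y):
            w = ftrl_weights(z, n, alpha, beta, l1, l2)
            p = 1.0 / (1.0 + np.exp(-np.clip(x_t @ w, -35.0, 35.0)))
            g = (p - y_t) * x_t  # gradient of the log loss for this example
            sigma = (np.sqrt(n + g * g) - np.sqrt(n)) / alpha
            z += g - sigma * w
            n += g * g
    return ftrl_weights(z, n, alpha, beta, l1, l2)

# Toy, linearly separable data: the third feature carries no signal.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(400, 3))
y_toy = (X_toy[:, 0] - X_toy[:, 1] > 0).astype(float)

w_fit = ftrl_fit(X_toy, y_toy, l1=0.01, l2=0.01)
accuracy = np.mean((X_toy @ w_fit > 0) == (y_toy == 1))
w_sparse = ftrl_fit(X_toy, y_toy, l1=1e6)  # extreme L1 zeroes every weight
print(w_fit, accuracy, w_sparse)
```

The L1 term is what produces exact zeros in the weight vector, while the L2 term and the per-coordinate learning rate `alpha / (beta + sqrt(n))` control how quickly each weight moves.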

Predictions

ROC Curves

Confusion Matrix

Estimator using an optimizer with a learning rate decay

Predictions

ROC Curves

Confusion Matrix


References

  1. Regression analysis Wikipedia page
  2. TensorFlow tutorials
  3. W.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.
  4. O.L. Mangasarian, W.N. Street and W.H. Wolberg. Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pages 570-577, July-August 1995.
  5. W.H. Wolberg, W.N. Street and O.L. Mangasarian. Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Letters, 77, pages 163-171, 1994.
  6. Online machine learning Wikipedia page
  7. Learning rate Wikipedia page